Abstract

Background: The role of n-3 fatty acids in prevention of breast cancer is well recognized, but the underlying 
molecular mechanisms are still unclear. In view of the growing need for early detection of breast cancer, Graham 
et al. (2010) studied the microarray gene expression in histologically normal epithelium of subjects with or without 
breast cancer. We conducted a secondary analysis of this dataset with a focus on the genes (n = 47) involved in 
fat and lipid metabolism. We used stepwise multivariate logistic regression analyses, volcano plots and false 
discovery rates for association analyses. We also conducted meta-analyses of other microarray studies using 
random effects models for three outcomes: risk of breast cancer (380 breast cancer patients and 240 normal
subjects), risk of metastasis (430 metastatic compared to 1104 non-metastatic breast cancers) and risk of recurrence 
(484 recurring versus 890 non-recurring breast cancers). 

Results: The HADHA gene [hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase 
(trifunctional protein), alpha subunit] was significantly under-expressed in breast cancer; more so in those with 
estrogen receptor-negative status. Our meta-analysis showed an 18.4%-26% reduction in HADHA expression in 
breast cancer. Also, there was an inconclusive but consistent under-expression of HADHA in subjects with 
metastatic and recurring breast cancers. 

Conclusions: Involvement of mitochondria and the mitochondrial trifunctional protein (encoded by the HADHA gene)
in breast carcinogenesis is known. Our results lend additional support to the possibility of this involvement. Further,
our results suggest that targeted subset analysis of large genome-based datasets can provide interesting
association signals.

Background

Early detection of malignant breast neoplasms is critical to cancer prevention and treatment. Cancer
chemoprevention (also called treatment of carcinogenesis) is a primordial prevention step that is
receiving considerable attention. In that context, the identification of an ideal biomarker for breast
cancer has become increasingly important. In spite of the vast number of studies conducted in the past,
a recent, comprehensive and elegant review argues that the process of breast carcinogenesis is still not
clearly understood [1]. Interestingly, it has been demonstrated that the mammary gland basal cells have
features consistent with progenitor stem cells and that they can differentiate into benign or malignant
lesions intraductally [2]. It has also been shown in murine models that differentiated intact mammary
glands can exert a negative influence on the development of breast cancer [3]. However, the search for
an ideal breast cancer biomarker continues [4].

A logical undertaking in this direction is the use of microarrays to study differential gene expression
in breast cancer. Consistent with the spirit of research that encourages very early detection of
carcinogenesis, Graham et al. [5] recently studied histologically normal epithelium from subjects with
and without breast cancer and identified an 86-gene signature that indicated a genomic change prior to
carcinogenesis. They found that many of these genes belonged to the families of growth factors,
cytokines, oxidative stress modifiers, p38 MAP kinase pathway members, transcription regulators or
determinants of nucleic acid stability [5].

Interestingly, they did not find genes associated with fatty acid or lipid metabolism to be
differentially expressed in histologically normal epithelium. Derangements of fatty acid and lipid
metabolism have been implicated in oncogenesis in many studies, especially in breast cancer [6]. It is
generally believed that diets rich in n-6 polyunsaturated fatty acids (n-6 PUFA) and saturated fatty
acids (SFA) increase the risk of tumorigenesis while diets rich in n-3 polyunsaturated fatty acids
(n-3 PUFA) reduce the risk of cancer development [7-10]. Lipids can influence the process of neoplasia
via their effects on hormone status, cell membrane integrity, signal transduction, immune modulation
and regulation of gene expression [11,12]. In this study, we specifically examined whether the genes
related to fatty acid and lipid metabolism are also differentially associated with breast cancer
status. For this, we used a targeted subset analysis of the microarray data from Graham et al. [5] and
also conducted meta-analyses of other microarray datasets.

Methods 

The primary dataset 

The microarray dataset used in the present study is available for public use on the Gene Expression
Omnibus website (http://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS3716) of the National Institutes
of Health, USA. Details of the study subjects on whom these microarray studies were conducted have been
described previously [5]. Briefly, the dataset comprises microarray data collected on the Affymetrix
Human Genome U133A platform, which measures expression of 22,283 genes. The data were collected using
histologically normal epithelium from four sets of subjects: those who underwent reduction mammoplasty
(n = 18), those who underwent preventive mastectomy (n = 6), estrogen receptor positive (ER+) breast
cancer patients (n = 9) and estrogen receptor negative (ER-) breast cancer patients (n = 9). The data
were available in normalized format.

Targeted subset analysis 

Our main aim was to assess whether genes related to fatty acid and lipid metabolism were differentially
expressed in the study dataset. For this we first compiled a list of genes that have been implicated in
fatty acid and lipid metabolism. We used the DAVID (http://david.abcc.ncifcrf.gov) and KEGG Pathway
(http://www.genome.jp/kegg/metabolism.html) websites and generated a list of 136 genes implicated in
one or more of the following pathways: fatty acid metabolism, fatty acid biosynthesis, peroxisome
proliferator-activated receptor (PPAR) signaling pathway, lipopolysaccharide biosynthesis, lipid
metabolism and fat digestion and absorption. A full list with functional annotation of these 136 genes
is provided as Additional file 1: Table S1. We then used the DAVID (http://david.abcc.ncifcrf.gov) and
Clone/Gene ID Converter (http://idconverter.bioinfo.cnio.es/IDconverter.php) programs to find out which
of these 136 genes were included in the Affymetrix Human Genome U133A platform. We found 47 probe sets
related to genes (Table 1) that partake in lipid or fatty acid metabolism to be represented in the
study dataset. We conducted our analyses on the potential differential expression of these 47 genes.
Complete functional annotation for these 47 genes is provided in Additional file 2: Table S2.

Replication of the results: meta-analyses 

We also aimed to ensure that the results obtained from one microarray dataset were robust and could be
replicated in other datasets. We queried the Oncomine database and retrieved microarray data from other
relevant studies. We studied the association of gene expression with three outcomes: risk of breast
cancer, risk of metastasis and risk of recurrence. We then combined these datasets meta-analytically
using the random effects model of DerSimonian and Laird [13,14]. For these analyses the effect size was
measured and expressed as the standardized mean difference (SMD) and its 95% confidence interval. The
Oncomine website reports the results as means, medians, quartiles and minimum and maximum values. Since
the random-effects model assumes normal distribution of the effect measures, we first estimated the
mean and standard error for each group (for example, subjects with breast cancer, subjects with a
metastatic event or subjects with recurring breast cancer) using the method described by Hozo et
al. [15]. We then estimated the SMDs. To depict the potential variability in HADHA expression based on
the probes used by individual studies, we conducted the meta-analyses separately for each combination
of study and probe. Each comparison thus represented a specific combination of the included study and
the reporter used in that study. The between-study heterogeneity in this meta-analysis was examined
using the I² statistic. Since expression data on all individual subjects
were available for the outcome of risk of breast cancer, we also conducted an individual patient data
(IPD) meta-analysis [16]. For this we used clustered unconditional logistic regression analyses [16,17]
with disease status as a dichotomous dependent variable, comparison-specific z-scores as the predictor
variable and the comparison indicator as the clustering variable. Comparison-specific z-scores were
estimated as the relative deviates (deviation of each expression value from the mean expression,
divided by the standard deviation of expression) within each comparison group.

Table 1 Genes included in the analyses

#   Symbol    Affymetrix Probe Set Id  Gene Name
1   ACAA1     202025_x_at              acetyl-Coenzyme A acyltransferase 1
2   ACADL     206068_s_at              acyl-Coenzyme A dehydrogenase, long chain
3   ACADM     202502_at                acyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain
4   ACAT1     205412_at                acetyl-Coenzyme A acetyltransferase 1
5   ACSBG2    221716_s_at              acyl-CoA synthetase bubblegum family member 2
6   ACSL3     201550_at                acyl-CoA synthetase long-chain family member 3
7   ACSL4     202422_s_at              acyl-CoA synthetase long-chain family member 4
8   ACSL5     218322_s_at              acyl-CoA synthetase long-chain family member 5
9   ADH1A     207820_at                alcohol dehydrogenase 1B (class I), beta polypeptide; alcohol dehydrogenase 1A (class I), alpha polypeptide; alcohol dehydrogenase 1C (class I), gamma polypeptide
10  ADH6      207544_s_at              alcohol dehydrogenase 6 (class V)
11  ADIPOQ    207175_at                adiponectin, C1Q and collagen domain containing
12  AGPAT2    210678_s_at              1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta)
13  ANGPTL4   221009_s_at              angiopoietin-like 4
14  APOA4     206894_at                apolipoprotein A-IV
15  APOC3     205820_s_at              apolipoprotein C-III
16  AQP7      206955_at                aquaporin 7
17  ARSA      204443_at                arylsulfatase A
18  ASAH1     210980_s_at              N-acylsphingosine amidohydrolase (acid ceramidase) 1
19  CERK      218421_at                ceramide kinase
20  CETP      206210_s_at              cholesteryl ester transfer protein, plasma
21  CYP7A1    207406_at                cytochrome P450, family 7, subfamily A, polypeptide 1
22  DGAT1     202344_at                diacylglycerol O-acyltransferase homolog 1 (mouse)
23  EHHADH    205222_at                enoyl-Coenzyme A, hydratase/3-hydroxyacyl Coenzyme A dehydrogenase
24  FABP2     207475_at                fatty acid binding protein 2, intestinal
25  FUT2      208505_s_at              fucosyltransferase 2 (secretor status included)
26  FUT4      209892_at                fucosyltransferase 4 (alpha (1,3) fucosyltransferase, myeloid-specific)
27  FUT5      210398_x_at              fucosyltransferase 5 (alpha (1,3) fucosyltransferase)
28  FUT9      207696_at                fucosyltransferase 9 (alpha (1,3) fucosyltransferase)
29  GCDH      203500_at                glutaryl-Coenzyme A dehydrogenase
30  GK2       215430_at                glycerol kinase 2
31  GLA       214430_at                galactosidase, alpha
32  HADHA     208629_s_at              hydroxyacyl-Coenzyme A dehydrogenase/3-ketoacyl-Coenzyme A thiolase/enoyl-Coenzyme A hydratase (trifunctional protein), alpha subunit
33  LTA       206975_at                lymphotoxin alpha (TNF superfamily, member 1)
34  MTTP      205675_at                microsomal triglyceride transfer protein
35  NPC1L1    220106_at                NPC1 (Niemann-Pick disease, type C1, gene)-like 1
36  NR1H3     203920_at                nuclear receptor subfamily 1, group H, member 3
37  OLR1      210004_at                oxidized low density lipoprotein (lectin-like) receptor 1
38  PCK2      202847_at                phosphoenolpyruvate carboxykinase 2 (mitochondrial)
39  PLTP      200661_at                phospholipid transfer protein
40  PPARG     208510_s_at              peroxisome proliferator-activated receptor gamma
41  RXRB      209148_at                retinoid X receptor, beta
42  RXRG      205954_at                retinoid X receptor, gamma
43  SGMS1     212989_at                sphingomyelin synthase 1
44  ST8SIA1   210073_at                ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1
45  UCP1      221384_at                uncoupling protein 1 (mitochondrial, proton carrier)
46  UCP3      207349_s_at              uncoupling protein 3 (mitochondrial, proton carrier)
47  UGCG      204881_s_at              UDP-glucose ceramide glucosyltransferase
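
As an illustration of the pooling steps described above, the following Python sketch estimates a mean and standard deviation from a reported minimum, median and maximum, converts two group summaries into an SMD, and pools several SMDs with the DerSimonian and Laird estimator. It is a minimal sketch under stated assumptions, not the code used in this study: the function names are hypothetical, only the simplest range-based formulas associated with Hozo et al. [15] are shown (sample-size-specific rules are omitted), and the SMD is assumed to be Cohen's d with a pooled standard deviation.

```python
import math


def mean_sd_from_range(low, median, high):
    """Approximate mean and SD from minimum, median and maximum.

    Assumption: simplest range-based formulas; rules that depend on
    sample size are not reproduced here.
    """
    mean = (low + 2 * median + high) / 4
    variance = ((low - 2 * median + high) ** 2 / 4 + (high - low) ** 2) / 12
    return mean, math.sqrt(variance)


def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 1 minus group 2) and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var_d


def dersimonian_laird(effects, variances):
    """Random-effects pooling of at least two effects.

    Returns the pooled estimate, its 95% CI, tau^2 and I^2 (percent).
    """
    k = len(effects)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-comparison variance
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

For example, `dersimonian_laird([0.0, 1.0], [0.1, 0.1])` pools two equally precise but discordant comparisons: the estimate sits midway between them and most of the observed variation is attributed to heterogeneity rather than sampling error.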

Other statistical analysis 

To quantify and test differential gene expression, we used two-tailed Student's t tests for unpaired
samples. The clinical and statistical significance of the findings was presented as volcano plots. To
account for multiple testing, we estimated the false discovery rates (q values) using the QVALITY
software program [18]. The discriminant utility of each gene was assessed using nonparametric receiver
operating characteristic (ROC) curve analysis. To group subjects based on their HADHA expression, we
used a k-means clustering approach. All statistical analyses were conducted using the Stata 10.0
software package (Stata Corp, College Station, Texas). We aimed for a type I error rate of 0.05 and a
false discovery rate of 0.15.
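
To make the multiple-testing step concrete, the sketch below computes the two coordinates of a volcano plot for one gene and a false discovery rate adjustment for a set of p values. It is illustrative only: the q values in this study came from the QVALITY program, whereas the sketch substitutes the simpler Benjamini-Hochberg adjustment, and it assumes that expression values are already on a log scale so that a difference of group means stands in for the log-fold change. The function names are hypothetical.

```python
import math


def volcano_point(mean_cases, mean_controls, p_value):
    """Return (log-fold change, -log10 p) for one gene.

    Assumption: expression is log-transformed, so the log-fold change
    is the difference between the two group means.
    """
    return mean_cases - mean_controls, -math.log10(p_value)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be flagged when its adjusted value falls at or below the chosen false discovery rate, for example `[q <= 0.15 for q in benjamini_hochberg(p_values)]`.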

Results 

Differential expression analyses 

Using the shortlisted set of 47 genes shown in Table 1, we first determined whether these genes were
differentially expressed between subjects with cancer (n = 18) and those without (n = 24). The volcano
plot (Figure 1A) showed that seven of the 47 genes were significantly differentially expressed between
these study groups. These genes included five over-expressed genes (AQP7, PLTP, PCK2, GCDH and ARSA)
and two under-expressed genes (ACSL5 and HADHA). Of these, HADHA was the most statistically
significant. To account for the possible covariance among these gene expression values we conducted
stepwise multivariate analyses using unconditional logistic regression and observed that only two
genes, HADHA and ARSA, were retained in the final model (Figure 1B). This model explained 35% of the
inter-individual variability in breast cancer susceptibility with a predictive accuracy of 86.8%.
Interestingly, when HADHA expression was removed from this model ARSA lost its statistical
significance, but removal of ARSA did not affect the statistical significance of HADHA. This indicates
that HADHA gene expression was the most important statistical predictor of altered risk of breast
cancer.

Does ER status influence the expression of HADHA?

To examine whether this association could be influenced by the ER status, we conducted three sets of
analyses. First, we studied whether HADHA expression differed by ER status. We found that HADHA was not
significantly differentially expressed by ER status (mean HADHA expression in subjects with ER+ breast
cancer = 6.00; in subjects with ER- breast cancer = 5.90; p = 0.462). Second, we adjusted the standard
error estimates for the ER status using clustered logistic regression and observed that the statistical
significance of HADHA gene expression further increased (p = 0.0001) while that of the ARSA gene
decreased (p = 0.082), indicating that the association of HADHA was unlikely to have been influenced by
the ER status. Third, we constructed volcano plots and conducted stepwise logistic regression analyses
by comparing the ER+ and ER- subjects separately with subjects without cancer as the reference group.
We observed (Figure 1C-F) that HADHA gene expression was the only consistent predictor across ER
status, but more so in the ER- subjects. Indeed, the q value for the HADHA gene was 0.15 for the cancer
versus no cancer comparison, 0.13 for the ER- versus no cancer comparison but 0.88 for the ER+ versus
no cancer comparison. Two other genes (UCP3 and DGAT1) were retained in the final model of stepwise
regression analyses when ER- subjects were compared to the no cancer group; however, this association
was not observed when ER+ subjects were compared to the same reference group.

Graded risk of breast cancer based on HADHA expression 

We next considered whether the association of HADHA gene expression with risk of breast cancer
exhibited a threshold effect or a graded dose-response. For this, we used two approaches. First, we
normalized the gene expression in the no cancer group to 100%. We found (Figure 1G) that HADHA
expression had fallen to 73% (95% CI 64%-83%) in subjects with cancer, with a higher expression in ER+
subjects (76% of the no cancer group, 95% CI 61%-91%) than in ER- subjects (70% of the no cancer group,
95% CI 55%-85%). Second, the k-means clusters (which explained 95.9% of the variability in HADHA
expression) clearly demonstrated a dose-response association (Figure 1H) such that more severe
down-regulation of HADHA was associated with a greater risk of being in the breast cancer group.
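
The clustering step can be sketched as a one-dimensional k-means on a single gene's expression values, followed by the share of variability that the clusters account for. This is a hedged illustration rather than the Stata procedure used in the study: the deterministic starting centres, the function names and the definition of "variability explained" as one minus the within-cluster sum of squares over the total sum of squares are all assumptions.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm in one dimension; returns (centres, clusters)."""
    data = sorted(values)
    n = len(data)
    # Assumption: deterministic start, centres spread over the sorted data.
    centres = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centres[j]))
            clusters[nearest].append(x)
        # Empty clusters keep their previous centre.
        updated = [
            sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)
        ]
        if updated == centres:
            break
        centres = updated
    return centres, clusters


def variance_explained(centres, clusters):
    """1 - (within-cluster sum of squares / total sum of squares)."""
    all_values = [x for c in clusters for x in c]
    grand_mean = sum(all_values) / len(all_values)
    total = sum((x - grand_mean) ** 2 for x in all_values)
    within = sum(
        (x - centres[j]) ** 2 for j, c in enumerate(clusters) for x in c
    )
    return 1 - within / total
```

With the clusters in hand, a dose-response pattern would be examined by computing the proportion of breast cancer subjects in each cluster and ordering the clusters by their centres.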

Meta-analyses of the differential expression of HADHA 

Lastly, we examined the robustness of the differential expression of HADHA by conducting a
meta-analysis of published microarray studies comparing cases of breast cancer with subjects without
breast cancer. Querying the Oncomine database, we found six studies [19-24] that represented 20
different comparisons of breast cancer patients with normal subjects (Table 2). The reasons for this
larger number of comparisons were the different reporters used in the microarray experiments as well as
the different subtypes of breast cancer reported by the studies.

Figure 1 Association of fatty acid and lipid metabolism related genes with the risk of breast cancer.
(A-F) Association analyses. Panels A, C and E show the volcano plots for the cancer versus no cancer,
ER- versus no cancer and ER+ versus no cancer comparisons, respectively. These plots depict the
biological significance (log-fold change) on the X-axis and the statistical significance (-log P) on
the Y-axis. Significance values above 0.1 are indicated by the grey shaded area in the volcano plots.
Panels B, D and F show the corresponding receiver-operating characteristic (ROC) curves for the final
models from stepwise logistic regression analyses. The genes retained in the final model and their
statistical significance are shown under the ROC curves, the variance explained by the model is shown
as R² and the predictive accuracy is indicated by the area under the ROC curve (AUC). (G) Comparative
expression of the HADHA gene in the indicated study groups. Error bars indicate 95% confidence
intervals. (H) Bubble plot showing the dose-response relationship between HADHA expression and the risk
of breast cancer. Each bubble represents one of the six clusters generated using the k-means clustering
algorithm based on HADHA expression. The radius of the bubble is proportional to the number of subjects
in that cluster (indicated by numbers next to the bubbles).

We first observed that the mean expression levels for HADHA probes (expressed as log transformed
values) differed widely across the six studies (Zhao et al. [24]: -0.33, Radvanyi et al. [21]: 3.08,
Richardson et al. [22]: 5.29, Karnoub et al. [20]: 2.97, Turashvili et al. [23]: 3.92 and Finak et
al. [19]: -2.65). We therefore transformed these values into comparison-specific z-scores (deviation of
each expression value from the mean expression for a comparison, divided by the standard deviation of
expression for that comparison). Upon this z-transformation, all the studies had a mean z-score of 0
and a standard deviation of 1. We conducted meta-analyses on the HADHA expression z-scores. Using the
DerSimonian and Laird model, we observed (Figure 2) that the summary SMD (filled diamond in Figure 2)
was -0.48 (95% CI -0.84 to -0.11). Considering the statistical properties of the SMD, it is possible to
transform this into a probability [25]. This transformation indicated that there was an average 18.4%
reduction in expression of HADHA (95% CI 4.5%-30.0%) in cases of breast cancer as compared to normal
subjects. Interestingly, this significant reduction in the expression of HADHA was observed in spite of
the high degree of heterogeneity (I² = 64.6%, p < 0.001, pie-chart in Figure 2) between the comparisons
due to different cancer subtypes, reporters used in various studies and other study characteristics.
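
The two transformations used in this paragraph can be sketched in a few lines of Python. The z-score function standardizes values within one comparison so that they have mean 0 and standard deviation 1. The second function treats the SMD as a standard normal deviate and reports the area between 0 and |SMD| under the normal curve as a percentage. The reported figures are numerically consistent with this reading (an SMD of -0.48 gives 18.4%), but whether it is exactly the transformation of reference [25] is an assumption, as are the function names.

```python
from statistics import NormalDist, mean, stdev


def z_scores(values):
    """Standardize values within one comparison (mean 0, SD 1)."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]


def smd_to_percent(smd):
    """Area between 0 and |SMD| under the standard normal curve, in percent.

    Assumption: this is how the SMD was converted to a percentage.
    """
    return (NormalDist().cdf(abs(smd)) - 0.5) * 100
```

Applied to the confidence limits, `smd_to_percent(-0.84)` gives about 30.0% and `smd_to_percent(-0.11)` gives about 4.4%, close to the reported 4.5% and presumably differing only through rounding of the SMD limit.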

We observed that the invasive ductal carcinoma (p = 
0.046) and unspecified invasive breast carcinoma (p = 
0.005) showed a significant under-expression of HADHA 
gene but lobular carcinoma (p = 0.781), invasive lobular 
carcinoma (p = 0.780) or invasive mixed carcinoma (p = 
0.717) did not show a significant alteration of HADHA 
gene expression. Alternatively, we conducted the IPD 
meta-analysis using logistic regression analyses. We 
found that the odds ratio for breast cancer was 0.74 
(95% CI 0.60-0.92) after clustered analyses. Thus, there 
was a 26% reduction in the risk of breast cancer per 
unit increase in z-scores. These values closely resemble the findings observed in the Graham
et al. dataset and demonstrate the replicability of our
findings.
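
The step from an odds ratio to a percentage change is simple arithmetic, sketched below with a hypothetical helper. Strictly, the quantity is a change in the odds of breast cancer per unit of z-score; reading it as a change in risk is an approximation. Because logistic regression is linear on the log-odds scale, the effect of a larger change in z-score follows by multiplying the log-odds.

```python
import math


def odds_ratio_summary(odds_ratio, units=1):
    """Scale an odds ratio to `units` of the predictor and express the
    result as a percentage change in the odds."""
    log_odds = math.log(odds_ratio) * units  # logistic coefficient x units
    scaled = math.exp(log_odds)
    return {
        "log_odds": log_odds,
        "odds_ratio": scaled,
        "percent_change": (scaled - 1) * 100,
    }
```

With the reported odds ratio of 0.74, one unit of z-score corresponds to a 26% reduction in the odds, and two units to an odds ratio of about 0.55.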

We also investigated whether HADHA expression was associated with an altered risk of metastasis and
recurrence. For risk of a metastatic event we found nine studies [26-33] representing 430 metastatic
events and 1104 metastasis-free cancers (Figure 3). Subjects who developed a metastatic event during
follow-up had a reduced HADHA expression (summary effect size -0.65, 95% CI -1.47 to 0.16) but this was
not statistically significant (p = 0.117). Also, there was a very high degree of between-study
heterogeneity (I² = 98.8%). Similarly, for the outcome of the risk of recurrence (Figure 4), we found
10 studies [19,27,29,30,32,34-38] representing 484 recurring and 890 non-recurring breast cancers.
Meta-analysis demonstrated that although there was a consistent decrease in average HADHA expression in
patients with a recurring form of breast cancer (summary effect size -0.60, 95% CI -1.44 to 0.24), the
finding was neither statistically significant (p = 0.160) nor homogeneous (I² = 98.7%) across studies.

Table 2 Comparisons included in the meta-analysis of differential HADHA expression

No  Author, Year      Ref   Controls  Cases  Breast cancer histology     Reporter
1   Zhao, 2004        [24]  3         37     Invasive ductal carcinoma   IMAGE:1473300
2   Zhao, 2004        [24]  3         21     Lobular carcinoma           IMAGE:1473300
3   Radvanyi, 2005    [21]  9         7      Invasive lobular carcinoma  BE297873
4   Radvanyi, 2005    [21]  9         32     Invasive ductal carcinoma   BE297873
5   Radvanyi, 2005    [21]  9         3      Invasive mixed carcinoma    BE297873
6   Radvanyi, 2005    [21]  9         3      Ductal carcinoma in situ    BE297873
7   Richardson, 2006  [22]  7         40     Ductal carcinoma            208629_s_at
8   Richardson, 2006  [22]  7         40     Ductal carcinoma            208630_at
9   Richardson, 2006  [22]  7         40     Ductal carcinoma            208631_s_at
10  Karnoub, 2007     [20]  15        7      Invasive ductal carcinoma   208629_s_at
11  Karnoub, 2007     [20]  15        7      Invasive ductal carcinoma   208630_at
12  Karnoub, 2007     [20]  15        7      Invasive ductal carcinoma   208631_s_at
13  Turashvili, 2007  [23]  20        5      Invasive ductal carcinoma   208629_s_at
14  Turashvili, 2007  [23]  20        5      Invasive ductal carcinoma   208629_s_at
15  Turashvili, 2007  [23]  20        5      Invasive ductal carcinoma   208630_at
16  Turashvili, 2007  [23]  20        5      Invasive lobular carcinoma  208630_at
17  Turashvili, 2007  [23]  20        5      Invasive lobular carcinoma  208631_s_at
18  Turashvili, 2007  [23]  20        5      Invasive lobular carcinoma  208631_s_at
19  Finak, 2008       [19]  6         53     Invasive breast carcinoma   A_24_P242688
20  Finak, 2008       [19]  6         53     Invasive breast carcinoma   A_24_P353964

Figure 2 Meta-analysis of the differential expression of the HADHA gene in breast cancer compared to
subjects without breast cancer. The figure shows a forest plot with filled diamonds indicating the
point estimates and error bars indicating the 95% confidence interval around the standardized mean
difference. The summary effect measure is shown as a filled diamond whose center (dashed vertical line)
indicates the point estimate and the width indicates the 95% confidence interval.

Discussion 

Our analyses of the microarray dataset based on the Graham et al. [5] study demonstrated a consistent,
strong and significant association of HADHA gene expression in histologically normal epithelium with
the likelihood of breast cancer. Moreover, this observation was further substantiated by the
meta-analysis of other published studies. Only one study has previously reported differential
association of this gene with regard to BRCA1 positive, BRCA2 positive and sporadic malignant tumors of
the breast [39]. Our results further support the putative involvement of HADHA in breast cancer
susceptibility.

Biological plausibility 

The biological significance of our novel observations should be considered in the light of the
following facts. First, the HADHA gene (chromosomal location 2p23) codes for the four alpha chains in
the octameric mitochondrial trifunctional protein (TFP) [40]. This enzyme performs three cardinal
functions in the β-oxidation of long chain fatty acids by catalyzing the activities of 2-enoyl-CoA
hydratase (ECH), L-3-hydroxyacyl-CoA dehydrogenase (HACD) and 3-ketoacyl-CoA thiolase (KACT). Of these
three, the first two activities (ECH and HACD) are specifically catalyzed by the alpha chains of TFP.
Severe deficiency (< 50% of normal activity) of TFP is known to be associated with the life-threatening
manifestations of long chain 3-hydroxyacyl-CoA dehydrogenase deficiency [41]. However, the effects of a
milder deficiency of TFP (for example, activity between 50%-80% of normal) are currently unknown. Our
results indicate that breast cancer patients had 18-30% decreased expression of the HADHA gene. We
therefore hypothesize that there may be a compromised metabolism of long chain fatty acids in breast
cancer due to a relative deficiency of the alpha chains of TFP. In this context, it is noteworthy that
a recent large genome-wide association study [42] found a strong association of breast cancer with a
polymorphism in the gene encoding enoyl CoA hydratase domain containing 1 (ECHDC1), which also partakes
in the integrity of the TFP.

Mamtani and Kulkarni BMC Research Notes 2012, 5:25
http://www.biomedcentral.com/1756-0500/5/25

Figure 3 Meta-analysis of the differential expression of the HADHA gene in breast cancer patients with
and without metastatic events. The figure shows a forest plot with filled diamonds indicating the point
estimates and error bars indicating the 95% confidence interval around the standardized mean
difference. The summary effect measure is shown as a hollow diamond whose center (dashed vertical line)
indicates the point estimate and the width indicates the 95% confidence interval. The horizontal axis
runs from -12.6 (underexpressed) to 12.6 (overexpressed). Values shown in the plot:

Author, year; reporter                  SMD (95% CI)             Weight
van de Vijver et al, 2002; NM_000182    1.70 (1.42, 1.98)        4.83
van't Veer et al, 2002; NM_000182       -0.07 (-0.47, 0.33)      4.80
Minn et al, 2005; 208629_s_at           1.69 (1.16, 2.22)        4.76
Minn et al, 2005; 208630_at             2.05 (1.49, 2.61)        4.75
Minn et al, 2005; 208631_s_at           1.69 (1.16, 2.22)        4.76
Desmedt et al, 2007; 208529_at          -3.23 (-3.68, -2.78)     4.79
Desmedt et al, 2007; 208630_at          -3.77 (-4.26, -3.28)     4.77
Desmedt et al, 2007; 208631_s_at        -11.46 (-12.64, -10.29)  4.41
Chin et al, 2007; 02-026274052          -1.39 (-1.78, -1.00)     4.80
Loi et al, 2007; 208629_s_at            0.27 (-0.18, 0.72)       4.79
Loi et al, 2007; 208630_at              0.30 (-0.15, 0.75)       4.79
Loi et al, 2007; 208631_s_at            -0.12 (-0.57, 0.33)      4.79
Loi et al, 2008; 208629_s_at            -0.28 (-0.95, 0.38)      4.71
Loi et al, 2008; 208630_at              1.82 (1.10, 2.55)        4.68
Loi et al, 2008; 208631_s_at            1.52 (0.81, 2.23)        4.69
Schmidt et al, 2008; 208629_s_at        -2.50 (-2.91, -2.09)     4.80
Schmidt et al, 2008; 208630_at          -3.47 (-3.95, -3.00)     4.78
Schmidt et al, 2008; 208631_s_at        -1.01 (-1.35, -0.66)     4.82
Kao et al, 2011; 208529_at              -0.20 (-0.45, 0.05)      4.84
Kao et al, 2011; 208630_at              1.68 (1.39, 1.96)        4.83
Kao et al, 2011; 208631_s_at            0.34 (0.09, 0.59)        4.84
Overall                                 -0.65 (-1.47, 0.16)      100.00



Second, the efficacy of β-oxidation of n-3 and n-6 long chain fatty acids can be tissue- and
location-specific. For example, in rat livers it has been shown that the n-3/n-6 ratio influences
peroxisomal but not mitochondrial β-oxidation [43]. In contrast, mitochondrial β-oxidation of long
chain fatty acids has been implicated in breast cancer pathogenesis [42]. We also could not demonstrate
a significant association of the genes involved in the PPAR-γ pathway, reinforcing the possibility that
mitochondrial rather than peroxisomal β-oxidation of long chain fatty acids may be more critical in
breast carcinogenesis. Third, HADHA occupies an important position in the network of genes that have
been implicated in autophagy and apoptosis [44]. Finally, triangulation of the following facts lends
additional credence to our observations: i) intact epithelium of mammary glands has the ability to act
as stem cells for carcinogenesis [2]; ii) n-3 long chain fatty acids have the ability to target such
stem cells [45]; and iii) HADHA is involved in the mitochondrial β-oxidation of long chain fatty acids.
Together these observations from published literature strongly support the biological plausibility of
our finding that HADHA is differentially expressed in subjects with and without breast cancer.

Limitations 

Our study has all the limitations implicit in any microarray association study and meta-analysis. In
addition, there are three more limitations. First, although there is strong circumstantial evidence
that favors an inference of an association between HADHA expression and breast cancer, robust
functional studies are required before this association can be conclusively claimed. Our study does not
have a component of functional assays that can help put these results in a biological perspective.
Second, due to limitations imposed by the microarray platform used in the primary study, we could not
evaluate the potential association of a large number of additional lipid and fat metabolism related
genes with the risk of breast cancer. Inclusion of those genes may not only affect the q values
associated with HADHA but may also provide a more comprehensive understanding of the role of fatty
acids in breast cancer. Third, although consistent, the observed differential expression of HADHA with
cancer progression (as reflected by risk of metastasis and recurrence) is statistically inconclusive.

Conclusions 

Our study has three important implications: biological, methodological and epidemiologic. Biologically,
our study has identified a novel target gene that corroborates the existing knowledge about the role of
long chain fatty acids in breast cancer and provides interesting directions for further research in
this area. Also, our findings put the focus on the putative functional aspects of mitochondria and TFP
in breast carcinogenesis.

From a methodological standpoint, our study shows that the high dimensionality of omics-type datasets
is fraught with the vexing problem of finding strong associations at the cost of potentially missing
weaker but biologically meaningful associations. Literature addressing the issue of multiple
comparisons in large volume datasets focuses primarily on the possibility of finding false positive
associations [46]. However, there exists a demonstrable probability that such high-volume datasets may
also falsely mask true associations. It is likely that the Graham et al. study did not report a
significant association of HADHA with the risk of breast cancer due to a large number of multiple
comparisons. The fact that we discovered an association of HADHA with breast cancer shows that
microarray dataset analysis (as well as analyses of other large datasets like genome-wide association
studies, proteomics data or metabolomics datasets) may benefit from targeted subset analyses based on
functional annotation and conceptual understanding of the molecular mechanisms in disease. Finally, in
an epidemiological context, our study shows that errors in long chain fatty acid metabolism in the
breast tissue might herald the onset of carcinogenesis and may thus be helpful for the primordial
prevention of breast cancer.

Additional material 



Additional file 1: Table S1. Excel table containing detailed annotation
of the 136 genes related to fat and lipid metabolism that were primarily
selected for analyses.

Additional file 2: Table S2. Excel table containing detailed annotation
of the 47 genes related to fat and lipid metabolism that were included
in the analyses of this study.